Comparison of LZ77-type Parsings
Abstract
We investigate the relations between different variants of the LZ77 parsing existing in the literature. All of them are defined as greedily constructed parsings encoding each phrase by reference to a string occurring earlier in the input. They differ by the phrase encodings: encoded by pairs (length + position of an earlier occurrence) or by triples (length + position of an earlier occurrence + the letter following the earlier occurring part); and they differ by allowing or not allowing overlaps between the phrase and its earlier occurrence. For a given string of length $n$ over an alphabet of size $\sigma$, denote the numbers of phrases in the parsings allowing (resp., not allowing) overlaps by $z$ (resp., $\hat{z}$), for “pairs”, and by $z_3$ (resp., $\hat{z}_3$), for “triples”. We prove the following bounds and provide series of examples showing that these bounds are tight:
• $z \le \hat{z} \le z \cdot O\left(\log \frac{n}{z \log_\sigma z}\right)$ and $z_3 \le \hat{z}_3 \le z_3 \cdot O\left(\log \frac{n}{z_3 \log_\sigma z_3}\right)$;
• $\frac{1}{2}\hat{z} < \hat{z}_3 \le \hat{z}$ and $\frac{1}{2}z < z_3 \le z$.
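The four parsings described in the abstract can be illustrated with a small brute-force sketch. This is one reading of the abstract's definitions, not the authors' code: the function name `lz_phrase_count` and its flags are hypothetical, a letter with no earlier occurrence is assumed to form a phrase of length 1 in the "pairs" variants, and the last phrase of a "triples" parsing is assumed to be allowed to end without a trailing letter. The running time is quadratic or worse, so it is only meant for short strings.

```python
def lz_phrase_count(s: str, overlap: bool, triples: bool) -> int:
    """Count phrases of a greedy LZ77-type parsing of s (naive sketch).

    overlap: whether a phrase may overlap its earlier occurrence.
    triples: whether each phrase also carries the letter that follows
             the copied part (assumed convention, see lead-in).
    """
    n = len(s)
    i = 0          # start of the current phrase
    count = 0
    while i < n:
        # Longest prefix of s[i:] that also starts at some j < i.
        best = 0
        for j in range(i):
            length = 0
            while (i + length < n
                   and s[j + length] == s[i + length]
                   and (overlap or j + length < i)):
                length += 1
            best = max(best, length)
        if triples:
            # Copied part plus one explicit letter (if any input remains).
            step = min(best + 1, n - i)
        else:
            # Copied part, or a single fresh letter when nothing matches.
            step = max(best, 1)
        i += step
        count += 1
    return count


if __name__ == "__main__":
    text = "abababab"
    z = lz_phrase_count(text, overlap=True, triples=False)
    z_hat = lz_phrase_count(text, overlap=False, triples=False)
    z3 = lz_phrase_count(text, overlap=True, triples=True)
    z3_hat = lz_phrase_count(text, overlap=False, triples=True)
    print(z, z_hat, z3, z3_hat)
```

On short inputs the counts can be checked against the second bullet above: under these conventions the "triples" count should be at most the "pairs" count and more than half of it, with or without overlaps.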
Similar resources
15418 Parallel Computer Architecture and Programming Project Report: Implementation and Comparison of Parallel LZ77 and LZ78 Algorithms
Our project developed a linear-work parallel LZ77 compression algorithm with unlimited window size. We also implemented two versions of the LZW compression algorithm. We cross-compared all algorithms and gave suggestions on how to choose an algorithm for real applications.
A Low-Power CAM Design for LZ Data Compression
Low-power and high-performance data compressors play an increasingly important role in the portable mobile computing and wireless communication markets. Among lossless data compression algorithms for hardware implementation, LZ77 is one of the most widely used. For real-time communication, some hardware LZ compressors/decompressors have been proposed in the past. Content addressable memory (CA...
Lempel-Ziv factorization: Simple, fast, practical
For decades the Lempel-Ziv (LZ77) factorization has been a cornerstone of data compression and string processing algorithms, and uses for it are still being uncovered. For example, LZ77 is central to several recent text indexing data structures designed to search highly repetitive collections. However, in many applications computation of the factorization remains a bottleneck in practice. In th...
Bicriteria data compression
The advent of massive datasets and the consequent design of high-performing distributed storage systems—such as BigTable by Google [7], Cassandra by Facebook [5], Hadoop by Apache—have reignited the interest of the scientific and engineering community towards the design of lossless data compressors which achieve effective compression ratio and very efficient decompression speed. Lempel-Ziv’s LZ...
Applying Compression to a Game's Network Protocol
This report presents the results of applying different compression algorithms to the network protocol of an online game. The algorithm implementations compared are zlib, liblzma and my own implementation based on LZ77 and a variation of adaptive Huffman coding. The comparison data was collected from the game TomeNET. The results show that adaptive coding is especially useful for compressing lar...
Journal title:
- CoRR
Volume abs/1708.03558, issue -
Pages -
Publication date 2017